🖥 PDF Craft: a Python library for converting PDFs (primarily scanned books) to Markdown and EPUB, using local AI models and an LLM to structure the content
GitHub

Key features

- Text and layout extraction
Combines DocLayout-YOLO with its own algorithms to detect and filter headings, columns, footnotes, and page numbers

- Local OCR
Recognizes page text with OnnxOCR and supports GPU acceleration (CUDA)

- Reading-order detection
Uses LayoutReader to arrange the text in the order a human reader would follow

- Markdown conversion
Generates a .md file with relative links to images (illustrations, tables, formulas) in the assets folder
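
To illustrate what a relative image link in the generated Markdown looks like, here is a minimal sketch. It does not use PDF Craft itself; the helper name image_link and the file name figure_1.png are hypothetical and chosen only for this example.

```python
from pathlib import PurePosixPath


def image_link(alt_text: str, assets_dir: str, file_name: str) -> str:
    """Build a Markdown image reference relative to the .md file.

    Hypothetical helper for illustration; not part of PDF Craft.
    """
    # PurePosixPath always joins with forward slashes, which is what
    # Markdown links expect regardless of the operating system.
    target = PurePosixPath(assets_dir) / file_name
    return f"![{alt_text}]({target})"


# Example with an assumed file name:
print(image_link("Figure 1", "assets", "figure_1.png"))
```

Because the link is relative, the .md file and its assets folder can be moved together without breaking the images.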

Installation and requirements
Python ≥ 3.10 (3.10.16 recommended).

Run pip install pdf-craft and pip install onnxruntime==1.21.0 (or onnxruntime-gpu==1.21.0 for CUDA).

For the EPUB pipeline, you need access to an LLM service (for example, DeepSeek).
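
Before installing, it can help to confirm that the environment meets the Python and ONNX Runtime requirements listed above. The following is a small sketch under stated assumptions: the function names python_ok and cuda_available are made up for this example, and the check only reports whether the installed ONNX Runtime build lists the CUDA execution provider, not whether a GPU actually works.

```python
import sys

# Minimum Python version stated in the requirements above.
MIN_PYTHON = (3, 10)


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is at least Python 3.10."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def cuda_available() -> bool:
    """Return True if ONNX Runtime is installed and lists the CUDA provider."""
    try:
        import onnxruntime as ort
    except ImportError:
        # ONNX Runtime is not installed in this environment.
        return False
    return "CUDAExecutionProvider" in ort.get_available_providers()


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("CUDA provider listed:", cuda_available())
```

If the second check prints False after installing onnxruntime-gpu, the CPU-only onnxruntime package may still be the one being imported.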

🟡 GitHub


#پایتون #Python #library

🆔 @Python4all_pro



tg-me.com/Python4all_pro/1585

BY پایتون ( Machine Learning | Data Science )



